Abstract

Bayesian Poisson probability distributions for the average $\bar{n}$ can be analytically converted
into equivalent chi-squared distributions. These can then be combined with 
other Gaussian or Bayesian Poisson distributions to make a total chi-squared 
distribution. This allows the usual treatment of chi-squared contours but now 
with both Poisson and Gaussian statistics experiments. This is illustrated 
with the case of neutrino oscillations. 



I. INTRODUCTION 



In analyzing the joint probability for mutual experimental results or for parameters, often 
a number of Poisson statistics experiments with a low number of events may be mixed with 
Gaussian experiments with high numbers of events. It is desirable to combine both types in a 
way to maintain the simplicity of a chi-squared distribution for all of the experiments. In this 
paper we show a simple mathematical identity between the Bayesian Poisson distribution 
for the average and an associated chi-squared distribution that allows us to accomplish this. 
We then apply this to the case of neutrino oscillation experiments with few events requiring
a Poisson treatment and find the form of the addition to $\chi^2$ from the Poisson experiments to
combine with Gaussian treated experiments to form a combined $\chi^2$ to study the oscillation
and mixing parameters. Having achieved the general result of including Poisson experiments
with Gaussian experiments, we then solve the simplest analytical cases for linear parameter 
dependences in the appendices. 

In section 2 we review the method for joining two chi-squared distributions into a joint 
chi-squared distribution. In section 3 we review using Bayes' theorem to find the Bayesian 
Poisson distribution for the average. In section 4 we show the exact equivalence of the 
Bayesian Poisson distribution for the average to a chi-squared distribution. We also show 
the domain of accuracy when a background is present. In section 5 we derive the joint 
probability distribution for combining a single Bayesian Poisson distribution for the average 
with a chi-squared distribution. In section 6 we then use the results of section 2 to combine 
in general the Bayesian Poisson distributions for averages with chi-squared distributions 
from Gaussian distributions. In section 7 we apply the method to the analysis of neutrino 
oscillation experiments with small numbers of events. In section 8 we present our conclusions. 

Several appendices complete the necessary tools with expanded probability tables. Others
solve the simplest analytic cases for contributions linear in the physical parameters.
Appendix A reviews the comparison of the integrated probability of the Bayesian Poisson
distribution for the average with the classical Poisson sum which is often used. Appendix
B gives a table of two-sided confidence level limits for the Bayesian Poisson average for a
single experiment. Appendix C gives a table of chi-squared confidence levels which are useful
for the joint distribution. Appendix D gives the solution for the minimum chi-squared
for the case that the means only depend linearly on the parameters in both the Poisson
and Gaussian distributions. Appendix E gives the most probable value and limits for a
single linear parameter in the combination of one Poisson experiment with one Gaussian
experiment. Appendix F examines the consistency of converting Poisson to chi-squared
distributions in the case of combining two Poisson distributions whose averages depend on one
linear parameter.

II. METHOD OF JOINING TWO CHI-SQUARED DISTRIBUTIONS 

First we show the result that will allow us to join Poisson distributions for the averages 
when we relate them to chi-squared distributions. We show that the chi-squared distributions 
convolute to form a joint chi-squared distribution. 

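The basic chi-squared distribution with N degrees of freedom is

$$ f_N(\chi^2) = \frac{1}{2\Gamma(N/2)} \left(\frac{\chi^2}{2}\right)^{N/2-1} e^{-\chi^2/2} \tag{1} $$

with norm

$$ 1 = \int_0^\infty d\chi^2\, f_N(\chi^2). \tag{2} $$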

The convolution integral for combining two chi-squared distributions for $N_1$ and $N_2$ to
produce a joint chi-squared distribution is

$$ f_N(\chi^2) = \int_0^{\chi^2} d\chi_1^2\, f_{N_1}(\chi_1^2)\, f_{N_2}(\chi^2 - \chi_1^2) \tag{3} $$

where $\chi_2^2$ is replaced by $(\chi^2 - \chi_1^2)$. By substituting chi-squared distributions in the above,
and changing the integration variable to $t = \chi_1^2/\chi^2$, we get

$$ f_N(\chi^2) = \frac{1}{2^{(N_1+N_2)/2}\,\Gamma(N_1/2)\,\Gamma(N_2/2)}\, e^{-\chi^2/2}\, (\chi^2)^{(N_1+N_2)/2-1} \int_0^1 dt\, t^{N_1/2-1} (1-t)^{N_2/2-1}. \tag{4} $$
Using the formula for the t integral, which is a beta function equal to
$\Gamma(N_1/2)\Gamma(N_2/2)/\Gamma((N_1+N_2)/2)$, one sees that the result $f_N(\chi^2)$ is the chi-squared
distribution function for $N = N_1 + N_2$. (The analogous formula for joining two Poisson distributions,
with averages $\bar{n}_1$ and $\bar{n}_2$ to produce $n_t$ total events is

$$ P(n_t; \bar{n}_t) = \sum_{n_1=0}^{n_t} P(n_1; \bar{n}_1)\, P(n_t - n_1; \bar{n}_2), \tag{5} $$

where $\bar{n}_t = \bar{n}_1 + \bar{n}_2$.)
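The convolution property of this section can be checked numerically. The following is a minimal sketch, not part of the original analysis: it evaluates the convolution integral of Eq. (3) with a simple midpoint rule and compares it with the chi-squared density for $N_1 + N_2$ degrees of freedom. The function names, the step count, and the test point are illustrative choices.

```python
import math


def chi2_pdf(x, dof):
    """Chi-squared density f_N(x) for dof degrees of freedom, as in Eq. (1)."""
    if x <= 0.0:
        return 0.0
    half = dof / 2.0
    return (x / 2.0) ** (half - 1.0) * math.exp(-x / 2.0) / (2.0 * math.gamma(half))


def convolve_chi2(x, dof1, dof2, steps=20000):
    """Midpoint-rule estimate of the convolution integral of Eq. (3) at chi2 = x."""
    h = x / steps
    total = 0.0
    for k in range(steps):
        x1 = (k + 0.5) * h
        total += chi2_pdf(x1, dof1) * chi2_pdf(x - x1, dof2)
    return total * h


# Illustrative case: N1 = 4 and N2 = 3 should combine into N = 7.
joint = convolve_chi2(5.0, 4, 3)
direct = chi2_pdf(5.0, 7)
print(joint, direct)
```

The two printed numbers should agree to within the accuracy of the quadrature, which is the content of Eq. (4) combined with the beta-function identity.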



III. POISSON DISTRIBUTION AND BAYES THEOREM FOR LIMITING $\bar{n}$

According to Bayes' Theorem the probability for a given "theoretical parameter
average" $\bar{n}$ given an observed number of events n, $P(\bar{n}; n)$, is proportional to the probability
of observing n events from a Poisson distribution with an average number of events $\bar{n}$, or
$P(n; \bar{n})$. The latter is

$$ P(n; \bar{n}) = \frac{\bar{n}^n e^{-\bar{n}}}{n!}. \tag{6} $$

With a uniform prior, the probability distribution for $\bar{n}$, $P(\bar{n}; n)$, is proportional to this, subject to the
normalization condition that the probability for all possible $\bar{n}$ should integrate to unity

$$ \int_0^\infty d\bar{n}\, P(\bar{n}; n) = 1. \tag{7} $$

This is satisfied by the formula for $P(n; \bar{n})$ without further renormalization, since the integral
is seen to be the form for $\Gamma(n+1)/n! = 1$. Thus we have the normalized distribution for $\bar{n}$
which we call the Bayesian Poisson distribution for the average

$$ P(\bar{n}; n) = \frac{\bar{n}^n e^{-\bar{n}}}{n!}. \tag{8} $$

IV. CONNECTION OF THE BAYESIAN POISSON DISTRIBUTION FOR THE 
AVERAGE TO A CHI-SQUARED DISTRIBUTION 



We will show a mapping of the variables ($\bar{n}$, n) from a Bayesian Poisson distribution
for the average to ($\chi^2$, N) for a chi-squared distribution that keeps the identical probability
distribution and integration of the Poisson distribution, but is now in a chi-squared form.
This may be used by itself using usual chi-squared probabilities and contours, or included
with other chi-squared joined experiments by the convolution integral in section 2.
The chi-squared distribution to be integrated over $d\chi^2$ for N degrees of freedom is

$$ f_N(\chi^2) = \frac{1}{2\Gamma(N/2)} \left(\frac{\chi^2}{2}\right)^{N/2-1} e^{-\chi^2/2}. \tag{9} $$

This is identical to the $\bar{n}$ distribution to be integrated over $\bar{n}$

$$ P(\bar{n}; n) = \frac{1}{n!}\, e^{-\bar{n}}\, \bar{n}^n \tag{10} $$

with the identification of

$$ \bar{n} = \frac{\chi^2}{2}, \quad \text{or} \quad \chi^2 = 2\bar{n}, \tag{11} $$

and

$$ n = N/2 - 1, \quad \text{or} \quad N = 2(n+1). \tag{12} $$

The equivalency of the two forms is noted in the Particle Data Group article on statistics [7],
but they do not use it to merge experiments into a chi-squared distribution. The identity
includes the integrals over ranges of probabilities in $\bar{n}$ or equivalently in $\chi^2$ using

$$ d\bar{n} = \frac{1}{2}\, d\chi^2. \tag{13} $$

Thus a Poisson with n events now counts mathematically as a chi-squared distribution with
N = 2n + 2 degrees of freedom.
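As a quick check of the mapping in Eqs. (11) to (13), the sketch below compares the Bayesian Poisson density in $\bar{n}$ with the chi-squared density evaluated at $\chi^2 = 2\bar{n}$ and multiplied by the Jacobian $d\chi^2/d\bar{n} = 2$. The function names and the sample values of n and $\bar{n}$ are illustrative and not taken from the paper.

```python
import math


def bayes_poisson_pdf(nbar, n):
    """Density in nbar for n observed events with a uniform prior, Eq. (10)."""
    return nbar ** n * math.exp(-nbar) / math.factorial(n)


def chi2_pdf(chi2, dof):
    """Chi-squared density for dof degrees of freedom, Eq. (9)."""
    half = dof / 2.0
    return (chi2 / 2.0) ** (half - 1.0) * math.exp(-chi2 / 2.0) / (2.0 * math.gamma(half))


def poisson_as_chi2(nbar, n):
    """Map (nbar, n) to (chi2, N) using Eqs. (11) and (12)."""
    return 2.0 * nbar, 2 * (n + 1)


n_obs, nbar = 3, 2.4
chi2, dof = poisson_as_chi2(nbar, n_obs)
lhs = bayes_poisson_pdf(nbar, n_obs)   # probability per unit nbar
rhs = 2.0 * chi2_pdf(chi2, dof)        # probability per unit chi2, times d(chi2)/d(nbar)
print(dof, lhs, rhs)
```

Because the identity is algebraic, the two densities agree to floating-point precision for any non-negative integer n and any positive $\bar{n}$.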

If the prior probability is of a logarithmic, power law preserving form preferred by
statisticians, $P(\bar{n}) \propto 1/\bar{n}$, then the normalized Bayesian Poisson distribution for the average is
directly seen to be the same as that for the uniform prior for n - 1 events, $P(\bar{n}; n-1)$ [1].
Since the Poisson form was the only requirement for the above connection between Bayesian
Poisson and chi-squared distributions, the results still hold for the logarithmic prior, but
with n replaced by n - 1, so that $N_{P\text{-}log} = 2n$.



For cases with an unknown mean signal number of events $\bar{n}_S$ plus an exact known
background average B, the Bayesian Poisson distribution for the mean ($\bar{n}_S + B$) when
$n_T$ events are observed is

$$ P(\bar{n}; n_T) = \frac{(\bar{n}_S + B)^{n_T}\, e^{-(\bar{n}_S + B)}}{\Gamma(n_T + 1, B)}, \tag{14} $$

where $\Gamma(n_T + 1, B)$ is the incomplete Gamma function. This results from the normalization
over only non-negative values of $\bar{n}_S$. However, this factor could ruin the simple convolution
properties on which this paper is based. In cases where B is small and $n_T$ a few events, this
correction is small and the $\Gamma(n_T + 1, B)$ can be replaced by $n_T!$ with little error, and the
simple formulas of this paper can again be used with $n = n_T$ and $\bar{n} = \bar{n}_T = \bar{n}_S + B$. To see
when this occurs we note that

$$ \Gamma(n_T + 1, B) = n_T!\, e^{-B} \left( 1 + B + \frac{B^2}{2!} + \ldots + \frac{B^{n_T}}{n_T!} \right). \tag{15} $$

For small B the above correction factor to $n_T!$ has leading term $(1 - B^{n_T+1}/(n_T+1)!)$, giving
hope of its being small if $n_T$ is not very small and B is. One way to state this is to give the
value of B for each $n_T$ at which the correction factor becomes a given value. The following
Table I gives the values at which the correction factor becomes 5% and 1%.



Table I: B Limits

n_T    0     1     2     3     4     5     6     7     8     9     10
5%    0.05  0.36  0.82  1.37  1.97  2.61  3.29  3.98  4.70  5.43  6.17
1%    0.01  0.15  0.44  0.82  1.28  1.79  2.33  2.91  3.51  4.13  4.77
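The entries of Table I can be reproduced from the finite sum for the incomplete Gamma function, $\Gamma(n_T+1, B)/n_T! = e^{-B} \sum_{k=0}^{n_T} B^k/k!$. The sketch below is an illustrative reconstruction, not the program used for the paper: it finds by bisection the B at which this ratio has dropped below one by the stated fraction. The function names and the bisection bracket are assumptions.

```python
import math


def correction_factor(n_t, b):
    """Gamma(n_T + 1, B) / n_T! written as exp(-B) * sum_{k <= n_T} B^k / k!."""
    return math.exp(-b) * sum(b ** k / math.factorial(k) for k in range(n_t + 1))


def b_limit(n_t, deviation, hi=50.0):
    """Bisection for the B at which the factor equals 1 - deviation."""
    lo = 0.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        # The factor decreases monotonically with B, so move the lower edge up
        # while the deviation from one is still below the target.
        if 1.0 - correction_factor(n_t, mid) < deviation:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for n_t in range(11):
    print(n_t, round(b_limit(n_t, 0.05), 2), round(b_limit(n_t, 0.01), 2))
```

For $n_T = 0$ the 5% limit is simply $-\ln(0.95) \approx 0.05$, which is the smallest entry in the table.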



V. DERIVATION OF JOINT PROBABILITY FOR A BAYESIAN POISSON 
DISTRIBUTION FOR THE AVERAGE AND A CHI-SQUARED DISTRIBUTION 

Here we demonstrate the derivation of the product probability for the case of one Poisson
distribution for the average with a chi-squared distribution for $\chi_G^2$ with $N_G$ degrees of
freedom formed either from Gaussians or from joint Gaussian and Poisson distributions. The
integrated product probability is

$$ \int d\bar{n}\, P(\bar{n}; n) \int d\chi_G^2\, f_{N_G}(\chi_G^2). \tag{16} $$

We convert the integral over the average $\bar{n}$ to the variable $\chi_P^2 = 2\bar{n}$ and rewrite using section 4

$$ d\bar{n}\, P(\bar{n}; n) = f_{N_P}(\chi_P^2)\, d\chi_P^2 \tag{17} $$

with $N_P = 2n + 2$. Into the new integral we now introduce the total $\chi^2$ by inserting
$1 = \int_0^\infty d\chi^2\, \delta(\chi^2 - \chi_P^2 - \chi_G^2)$ and use this to do the $d\chi_G^2$ integral, which limits $\chi_P^2 < \chi^2$ and
gives

$$ \int d\chi^2 \int_0^{\chi^2} d\chi_P^2\, f_{N_P}(\chi_P^2)\, f_{N_G}(\chi^2 - \chi_P^2). \tag{18} $$

By the chi-squared convolution integral, the second integral is $f_{N_P+N_G}(\chi^2)$, which is the
resultant probability distribution for this case, with $\chi^2 = \chi_P^2 + \chi_G^2$. The result is an exact
joint chi-squared probability combining a Poisson experiment with a chi-squared distribution
from previously combined experiments.

VI. MERGING BAYESIAN POISSON AND CHI-SQUARED DISTRIBUTIONS

Now that we have a $f_N(\chi^2)$ distribution Eq. (9) that is equivalent to a Bayesian Poisson
parameter distribution in value and in its probability integral, we can merge this (independent
of its origin) with other chi-squared distributions using Eq. (3), the convolution, to
obtain the final $\chi^2$ distribution.

The results can now be used, for example, in finding $\chi^2$ contours corresponding to various
confidence levels. We must remember that a single Poisson experiment with a uniform prior
now counts as N = 2(n + 1) degrees of freedom, where n is the number of observed events
in the Poisson distribution. While this sounds counter-intuitive, we recall that the form
of the $\chi^2$ distribution that we are using also has $\chi^2$ replaced by $2\bar{n}$, and with the above
replacements, $\chi^2$ per degree of freedom N or $\chi^2/N = 2\bar{n}/(2(n+1))$, approaches 1 at large
n since $\bar{n}$ is within $\sqrt{n}$ of n.
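The joint distribution of a Poisson experiment and a Gaussian chi-squared can also be illustrated by sampling. In the sketch below, which is an illustration rather than the paper's procedure, $\bar{n}$ is drawn from the Bayesian Poisson distribution for the average (a Gamma distribution with shape n + 1 and unit scale), $\chi_G^2$ is drawn as a sum of $N_G$ squared standard normal variables, and the total $2\bar{n} + \chi_G^2$ is compared with what a chi-squared distribution with $N_P + N_G$ degrees of freedom predicts. The seed, sample size, and the choice n = 1, $N_G$ = 3 are arbitrary.

```python
import random


def sample_joint_chi2(n_obs, n_gauss, rng):
    """Draw one value of chi2 = 2 * nbar + chi2_G."""
    # Bayesian Poisson posterior for nbar with a uniform prior: Gamma(n_obs + 1, scale 1).
    nbar = rng.gammavariate(n_obs + 1, 1.0)
    # Chi-squared with n_gauss degrees of freedom as a sum of squared normals.
    chi2_g = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_gauss))
    return 2.0 * nbar + chi2_g


rng = random.Random(12345)
n_obs, n_gauss = 1, 3
total_dof = 2 * (n_obs + 1) + n_gauss          # N_P + N_G = 7
samples = [sample_joint_chi2(n_obs, n_gauss, rng) for _ in range(100000)]
mean = sum(samples) / len(samples)
# 12.02 is the 90% chi-squared value for 7 degrees of freedom.
frac_below = sum(s < 12.02 for s in samples) / len(samples)
print(total_dof, mean, frac_below)
```

The sample mean should be close to 7, the mean of a chi-squared distribution with seven degrees of freedom, and about 90% of the samples should fall below 12.02.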
If $M_P$ is the number of Poisson experiments with $n_i$ events in the i'th experiment, we
associate with each $N_i = 2n_i + 2$ degrees of freedom. We call the associated theoretical
Poisson averages $\bar{n}_i$. The total Poisson degrees of freedom becomes

$$ N_P = \sum_{i=1}^{M_P} N_i = \sum_{i=1}^{M_P} (2n_i + 2). \tag{19} $$

With the alternate choice of a logarithmic prior, $N_{P\text{-}log} = \sum_{i=1}^{M_P} 2n_i$. We now convolute
the Poisson distributions for the average in the chi-squared forms, Eqs. (8-17) with the
chi-squared distribution of $N_G$ Gaussian experimental degrees of freedom which have a
chi-squared $\chi_G^2$. The result will use the joint chi-square

$$ \chi_{PG}^2 = 2 \sum_{i=1}^{M_P} \bar{n}_i + \chi_G^2. \tag{20} $$

From successive convolutions in Eq. (3), the combined chi-squared distribution for the
Poisson plus Gaussian distributions is finally

$$ f_{(N_G + N_P)}(\chi_{PG}^2). \tag{21} $$

We emphasize that these results are an exact treatment, not involving large n or other
approximations. As in the standard treatment, if $N_{par}$ is the number of parameters that
are being fitted, then the number of degrees of freedom is $dof = N = N_G + N_P - N_{par}$. In
Appendix B we show how the $\chi^2$ limits at various confidence levels for two-sided distributions
are related to Poisson sums. In Appendix B we give an expanded Table II that can be used
for two-sided $\chi^2$ limits at given confidence levels. In Appendix C we give an expanded table for
single-sided $\chi^2$ values or $\chi^2$ contours for N up to 25 corresponding to various confidence
levels. In the respective appendices we also give Mathematica programs to be used for
larger N or other confidence levels.

This method has been applied in analyzing the constraints of many experiments on new
flavor changing neutral current models of CP violation in B meson decay asymmetries.
There, all experiments have a Gaussian distribution, except for an experiment [11] where
one event has been seen in $K^+ \to \pi^+ \nu \bar{\nu}$ and is treated with an additional $\chi_P^2 = 2\bar{n}$ and
adding four degrees of freedom. In that case, $\bar{n}$ is a function of the down quark mixing
matrix elements as are the other experiments. That analysis also provides an example of
the sensitivity to the choice of a uniform or logarithmic prior probability distribution. With
the uniform prior, the total number of degrees of freedom is seven, and the chi-squared limits
are at 8.2, 12.0, and 14.3 for 1-$\sigma$, 90% (1.64-$\sigma$), and 2-$\sigma$ confidence levels, respectively. With
the logarithmic prior, the total number of degrees of freedom decreases by two to five, and
the chi-squared limits are at 5.89, 9.24, and 11.3, for the 1-$\sigma$, 90%, and 2-$\sigma$ confidence levels,
respectively. The chi-squared per degree of freedom ratios stay within 10% of each other
between the two cases. However, use of the logarithmic prior does move the contours in by
two to three units or about 1/2 of a standard deviation, and thus gives tighter bounds.
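The bookkeeping of degrees of freedom and of the joint chi-squared can be sketched as follows. This is a hedged illustration: the helper names are hypothetical, and the split of the non-Poisson part into three Gaussian degrees of freedom and zero fitted parameters is only an assumption chosen so that the totals match the seven and five degrees of freedom quoted in the text.

```python
def poisson_dof(n_obs, prior="uniform"):
    """Degrees of freedom assigned to one Poisson experiment with n_obs events."""
    if prior == "uniform":
        return 2 * (n_obs + 1)
    if prior == "log":
        return 2 * n_obs
    raise ValueError("prior must be 'uniform' or 'log'")


def joint_dof(poisson_counts, n_gauss, n_par, prior="uniform"):
    """Total dof = N_G + N_P - N_par for a list of observed Poisson counts."""
    return sum(poisson_dof(n, prior) for n in poisson_counts) + n_gauss - n_par


def joint_chi2(poisson_means, chi2_gauss):
    """Joint chi-squared: twice the sum of theoretical Poisson averages plus chi2_G."""
    return 2.0 * sum(poisson_means) + chi2_gauss


# One Poisson experiment with a single observed event, plus an assumed
# net Gaussian contribution of three degrees of freedom.
print(joint_dof([1], n_gauss=3, n_par=0, prior="uniform"))  # 7
print(joint_dof([1], n_gauss=3, n_par=0, prior="log"))      # 5
print(joint_chi2([0.8], chi2_gauss=4.2))
```

With these totals the confidence limits are read from a chi-squared table at seven or five degrees of freedom, which is how the two sets of limits quoted above arise.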

Parenthetically we add that in the limit of large n and $\bar{n}$, just as the Poisson distribution
becomes a Gaussian, so does the equivalent chi-squared distribution. The chi-squared
distribution in Eq. (9) becomes

$$ G(\xi, \sigma) = \frac{e^{-\xi^2/2\sigma^2}}{\sqrt{2\pi}\,\sigma}, \tag{22} $$

where in our variables $\sigma = \bar{n}^{1/2} = (\chi^2/2)^{1/2}$, $\xi = n - \bar{n} = (N/2 - 1) - \chi^2/2$, and $d\chi^2 = -2\,d\xi$.
Since $|\xi|$ is confined to the order of $\sigma$ for large n and $\bar{n}$, the difference $|N/2 - 1 - \chi^2/2|$ is
confined to the order of $\sqrt{\chi^2/2}$ or $\sqrt{N/2}$ for large N and $\chi^2$.

VII. NEUTRINO OSCILLATION EXPERIMENTS 

Here we shall see that using the combined Poisson method for small numbers of events 
per bin leads to a result which considers only the total number of events in a single Poisson 
distribution, and makes the two methods identical. 



A. Appearance Neutrino Oscillation Experiments 

For example, we consider a $\nu_\mu \to \nu_e$ appearance experiment. Let $n_i^0$ be the number of
expected $\nu_\mu$ in the i'th bin at energy $E_i$, $n_i$ the number of observed events in that bin, and $b_i$
the known background in that bin. With the two neutrino oscillation formula, the average
number of electrons in that bin will be

$$ \bar{n}_i = n_i^0 \sin^2(2\theta) \sin^2(1.27\,\delta m^2 L/E_i). \tag{23} $$

By the method of expressing Bayesian Poisson distributions in the chi-squared formalism, we get the
total chi-squared as a linear sum of expected events for each bin from Eq. 14, if the $b_i$ are
sufficiently small

$$ \chi^2 = \sum_i (2\bar{n}_i + 2b_i). \tag{24} $$

Also, the number of degrees of freedom is twice the total number of observed events n when
using the logarithmic prior

$$ N_{P\text{-}log} = 2 \sum_i n_i = 2n. \tag{25} $$

The sum of background events is denoted by $B = \sum_i b_i$.

With small bin size $\Delta E_i$, $n_i^0 = (dn/dE)\,\Delta E_i$, and the sum of the expected number of
events at full mixing can be converted into an integral

$$ n^0(\delta m^2) = \int dE\, \frac{dn}{dE}\, \sin^2\!\left(\frac{1.27\,\delta m^2 L}{E}\right). \tag{26} $$

So we now have a binning independent form for $\chi^2$ from the sum over bins

$$ \chi^2 = 2 \sin^2(2\theta)\, n^0(\delta m^2) + 2B \tag{27} $$

and a binning independent number of degrees of freedom N = 2n. The probability
distribution is now

$$ f_N(\chi^2) = f_{2n}\!\left(2 \sin^2(2\theta)\, n^0(\delta m^2) + 2B\right). \tag{28} $$

We set 90% CL limits using a one-sided CL if there is no signal, and a two-sided CL if there
is a signal. For the one-sided CL limit, the average background B has to be less than or
equal to 0.05 events for the $n_T = 0$ Poisson to be accurately normalized.
Going backwards from a chi-squared distribution to its equivalent Poisson distribution,
this chi-squared result is equivalent to a Bayesian Poisson distribution for the average with

$$ \bar{n} = \chi^2/2 = \sin^2(2\theta)\, n^0(\delta m^2) + B, \tag{29} $$

with n events observed. This is the same as the usual approach of grouping all events into
one bin of the total number of events, which is used if there are few events. As in the case
of Eq. 14, B must be small enough not to significantly affect the normalization.
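The reduction of the binned sum to a single-bin form can be made concrete with a short sketch. The bin contents, baseline, and oscillation parameters below are invented for illustration only, and the bin sum stands in for the energy integral that defines $n^0(\delta m^2)$. Units are whatever convention makes the factor 1.27 appropriate (commonly L in km, E in GeV, and $\delta m^2$ in eV squared).

```python
import math


def osc_factor(dm2, baseline, energy):
    """Energy-dependent factor sin^2(1.27 * dm2 * L / E)."""
    return math.sin(1.27 * dm2 * baseline / energy) ** 2


def binned_chi2(sin2_2theta, dm2, baseline, bins):
    """Sum over bins of 2 * nbar_i + 2 * b_i; bins holds (E_i, n0_i, b_i) tuples."""
    total = 0.0
    for energy, n0_i, b_i in bins:
        nbar_i = n0_i * sin2_2theta * osc_factor(dm2, baseline, energy)
        total += 2.0 * nbar_i + 2.0 * b_i
    return total


def single_bin_chi2(sin2_2theta, dm2, baseline, bins):
    """Same quantity written as 2 * sin^2(2 theta) * n0(dm2) + 2 * B."""
    n0_dm2 = sum(n0_i * osc_factor(dm2, baseline, e) for e, n0_i, _ in bins)
    background = sum(b_i for _, _, b_i in bins)
    return 2.0 * sin2_2theta * n0_dm2 + 2.0 * background


# Hypothetical spectrum: (energy, expected unoscillated events, background).
bins = [(1.0, 40.0, 0.01), (2.0, 65.0, 0.02), (3.0, 30.0, 0.01)]
print(binned_chi2(0.01, 2.5, 0.5, bins), single_bin_chi2(0.01, 2.5, 0.5, bins))
```

Because the contribution of each bin is linear in $\bar{n}_i$, the two expressions agree exactly for any binning, which is why only the total number of events matters in this treatment.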

B. Disappearance Neutrino Oscillation Experiments 

For a disappearance experiment, the expected number of events per bin is

$$ \bar{n}_i = n_i^0 \left(1 - \sin^2(2\theta) \sin^2(1.27\,\delta m^2 L/E_i)\right). \tag{30} $$

Using the same sums as for the appearance experiment, and defining the sum of the
coefficients of the 1 term or the total number of expected neutrino events without oscillation
as

$$ n^0 = \int dE\, \frac{dn}{dE}, \tag{31} $$

we have the probability distribution

$$ f_N(\chi^2) = f_{2n}\!\left(2n^0 - 2 \sin^2(2\theta)\, n^0(\delta m^2) + 2B\right). \tag{32} $$

If the total number of events is large enough to use a Gaussian approximation, these
are then the same results as using a single Gaussian in the usual method for comparing the
total number of events with and without oscillation. But even with a limited total number
of events, the formulas above with a chi-squared distribution are an improvement over a
Gaussian, as long as the backgrounds $b_i$ are small enough in each bin.
C. General Comments on Oscillation Results

What we have achieved is that for a small number of events, we have found the $\chi_P^2$
for the Poisson neutrino appearance and disappearance experiments, Eqs. (27) and (32),
respectively, that can be added to $\chi^2$ from other Poisson or Gaussian neutrino experiments
to determine neutrino oscillation and mixing parameters using standard $\chi^2$ methods with
2n extra degrees of freedom as in the logarithmic prior case. The drawback is that the result
is equivalent to a comprehensive bin in energy containing all events. When the number of
particles per each energy bin becomes significant, it is better to use a Gaussian for each bin
to derive information contained in the detailed energy spectrum.

D. One-Sided Chi-squared Limits on Oscillation 

We find a contour in the ($\sin^2(2\theta)$, $\delta m^2$) plane where for the probability distribution
$f_{2n}(\chi^2)$ the amount of probability contained in the major part is the confidence level CL. The
appropriate one-sided chi-squared limits $\chi_{CL\pm}^2(2n)$ for n observed events and $N_P = 2(n+1)$
for a uniform prior or $N_{P\text{-}log} = 2n$ for a logarithmic prior are found in Table III of Appendix
C.

1. Appearance Experiment 

In practice, for each $\delta m^2$ we find the value of $\sin^2(2\theta)$ such that the bound becomes an
equality

$$ 2 \sin^2(2\theta)\, n^0(\delta m^2) + 2B < \chi_{CL+}^2(2n). \tag{33} $$

$CL^+$ means that for a 95% CL limit, only 5% is left off of the upper part of the distribution.
The excluded region is where the left-hand-side is larger than the chi-squared upper CL
limit, giving an upper bound on $\sin^2(2\theta)$.
2. Disappearance Experiment

Here, the excluded region is where the left-hand-side is smaller than the chi-squared
lower CL limit, and the allowed region is

$$ 2n^0 - 2 \sin^2(2\theta)\, n^0(\delta m^2) + 2B > \chi_{CL-}^2(2n), \tag{34} $$

which again restricts the result with an upper bound on $\sin^2(2\theta)$.
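The one-sided bounds for the appearance and disappearance cases can be sketched in a few lines. Since the degrees of freedom assigned to Poisson experiments are even, the chi-squared cumulative distribution reduces to a finite Poisson sum and no special-function library is needed. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the lower limit is taken to be the chi-squared value below which a fraction 1 - CL of the probability lies, and bounds are cut off to the physical range between zero and one.

```python
import math


def chi2_cdf_even(x, dof):
    """P(chi2 < x) for even dof, using the finite Poisson sum."""
    half = x / 2.0
    tail = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(dof // 2))
    return 1.0 - tail


def chi2_quantile_even(p, dof):
    """Chi-squared value below which a fraction p of the probability lies."""
    lo, hi = 0.0, 10.0 * dof + 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def appearance_upper_bound(dof, background, n0_dm2, cl=0.90):
    """Largest sin^2(2 theta) with 2 s n0(dm2) + 2B below the upper chi2 limit."""
    chi2_upper = chi2_quantile_even(cl, dof)
    return max(0.0, min(1.0, (chi2_upper / 2.0 - background) / n0_dm2))


def disappearance_upper_bound(dof, background, n0, n0_dm2, cl=0.90):
    """Largest sin^2(2 theta) with 2 n0 - 2 s n0(dm2) + 2B above the lower chi2 limit."""
    chi2_lower = chi2_quantile_even(1.0 - cl, dof)
    return max(0.0, min(1.0, (n0 + background - chi2_lower / 2.0) / n0_dm2))


# No events observed, uniform prior (dof = 2), negligible background,
# and a hypothetical n0(dm2) = 10 expected events at full mixing.
print(appearance_upper_bound(2, 0.0, 10.0))
```

In this example the 90% chi-squared limit for two degrees of freedom is 4.605, so the bound on the mean is 2.30 events and the bound on the mixing is about 0.23.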

E. Large n Gaussian Approximation 

While the previous results were accurate for small $b_i$, for large n we may use the
approximation that the chi-squared distribution resembles a Gaussian distribution near its
peak

$$ f_{2n}(\chi^2) \approx \frac{1}{2\sqrt{2\pi n}} \exp\left(-\frac{(n - \chi^2/2)^2}{2n}\right). \tag{35} $$

A one-sided 95% CL limit which leaves 5% on one side, is at the same deviation (in $\chi^2/2$)
from the center of the Gaussian as the usual two-sided 90% CL limit which leaves 5% on
both sides. This occurs at

$$ |n - \chi^2/2| = 1.64\sigma = 1.64\sqrt{n}. \tag{36} $$

This yields the chi-squared limits below. Since the multiplier term of $\sin^2(2\theta)$ can average
to a half or be less than that, values of $\sin^2(2\theta)$ greater than one can be reached in these
limits, and they must be cut off at one.

For appearance experiments, the two sided 90% CL limits are

$$ \sin^2(2\theta) < \frac{n + 1.64\sqrt{n} - B}{n^0(\delta m^2)}. \tag{37} $$

For disappearance experiments, the two sided 90% CL limits are

$$ \sin^2(2\theta) < \frac{n^0 - n + 1.64\sqrt{n} + B}{n^0(\delta m^2)}. \tag{38} $$

For one-sided 90% CL limits we use $1.28\sigma$.
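The large-n limits can be written as two short functions. The sketch below is an illustration under stated assumptions: the appearance bound follows from inserting $\chi^2/2 = n + z\sqrt{n}$ into Eq. (33), the disappearance bound from inserting $\chi^2/2 = n - z\sqrt{n}$ into Eq. (34), and the cut-off at one implements the remark that unphysical values must be truncated. Function and argument names are hypothetical.

```python
import math


def appearance_limit_gauss(n_obs, background, n0_dm2, z=1.64):
    """Gaussian-approximation upper bound on sin^2(2 theta) for appearance."""
    bound = (n_obs + z * math.sqrt(n_obs) - background) / n0_dm2
    return min(1.0, bound)


def disappearance_limit_gauss(n_obs, background, n0, n0_dm2, z=1.64):
    """Gaussian-approximation upper bound on sin^2(2 theta) for disappearance."""
    bound = (n0 + background - n_obs + z * math.sqrt(n_obs)) / n0_dm2
    return min(1.0, bound)


# Hypothetical numbers: 100 observed events and n0(dm2) = 1000.
print(appearance_limit_gauss(100, 0.0, 1000.0))
# Hypothetical disappearance case: 400 observed, 500 expected without oscillation.
print(disappearance_limit_gauss(400, 0.0, 500.0, 250.0))
# One-sided 90% limit uses z = 1.28 instead of 1.64.
print(appearance_limit_gauss(100, 0.0, 1000.0, z=1.28))
```

For small n these expressions are only approximate, and the exact chi-squared limits for 2n or 2(n + 1) degrees of freedom should be used instead.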
VIII. CONCLUSIONS

In conclusion, we have shown how the simplicity of the Bayesian $\chi^2$ analysis can be
exactly extended to include experiments with a small number of events which are described
by a Bayesian Poisson distribution for the average. This precise analytic treatment (provided
the background is small) is useful since it uses the simple chi-squared treatment for all
experiments, even if some experiments have too few events to be a standard Gaussian. We
have provided useful tables for the method by extending them to larger n to accompany the
larger number of degrees of freedom used. We have analyzed neutrino oscillation experiments
and showed how the analytic combination of Poisson bins through the equivalent chi-squared
distributions leads to the standard Poisson result for the total number of events. However,
using the equivalence to a chi-squared distribution, we have found the appropriate $\chi_P^2$ to
add to the $\chi^2$ from other experiments to use standard $\chi^2$ methods.
APPENDIX A: COMPARISON WITH OTHER FORMULAS USED FOR
POISSON PARAMETER LIMITS

For completeness we include here some properties of and a comparison between the
classical (or frequentist) and Bayesian Poisson limits on $\bar{n}$. The methods are given full
discussion by R. D. Cousins in Ref. 1. The classical Poisson parameter distribution used
for the upper $\bar{n}$ limit is to sum the Poisson distributions $P(n'; \bar{n})$ from n + 1 events to
infinity, when the number of observed events is n, and use it as the probability for $\bar{n}$ when
$\bar{n}$ is greater than n. We show that the Bayesian Poisson parameter distribution Eq. (8)
integrated from zero to a cutoff $\bar{n}_c$ agrees with the above formulation [1]. First we do the
integrated probability for $\bar{n}$ from $\bar{n}_c$ to infinity by integrating $e^{-\bar{n}}$ by parts

$$ I(\bar{n}_c; n) = \int_{\bar{n}_c}^\infty d\bar{n}\, \frac{\bar{n}^n}{n!}\, e^{-\bar{n}} \tag{A1} $$

$$ = \left. -\frac{\bar{n}^n}{n!}\, e^{-\bar{n}} \right|_{\bar{n}_c}^\infty + \int_{\bar{n}_c}^\infty d\bar{n}\, \frac{\bar{n}^{n-1}}{(n-1)!}\, e^{-\bar{n}} \tag{A2} $$

$$ = P(n; \bar{n}_c) + I(\bar{n}_c; n-1). \tag{A3} $$

Continued integration by parts shows that the integral over a semi-infinite interval beginning
at $\bar{n}_c$ of the Bayesian Poisson parameter distribution is

$$ I(\bar{n}_c; n) = P(n; \bar{n}_c) + P(n-1; \bar{n}_c) + \ldots + P(0; \bar{n}_c). \tag{A4} $$

The two methods are now seen to be equivalent using $\bar{n} = \bar{n}_c$ and the fact that the Poisson
terms sum to 1

$$ \sum_{n'=n+1}^\infty P(n'; \bar{n}) = 1 - \sum_{n'=0}^{n} P(n'; \bar{n}) \tag{A5} $$

$$ = 1 - I(\bar{n}; n) \tag{A6} $$

$$ = \int_0^{\bar{n}} d\bar{n}'\, \frac{\bar{n}'^n}{n!}\, e^{-\bar{n}'}, \tag{A7} $$

from Eqs. (A4) and (A3).



For n the number of observed events, the rule for the "1-$\sigma$" upper limit on $\bar{n}_c$ is to find
$\bar{n}^+$ such that 84% of the time there would be greater than n events. Since "1-$\sigma$" means 32%
is outside the central region, 16% should occur on one side. Thus the sum from n + 1 to
infinity is set equal to 0.84

$$ \sum_{n'=n+1}^\infty P(n'; \bar{n}^+) = \int_0^{\bar{n}^+} d\bar{n}\, \frac{\bar{n}^n}{n!}\, e^{-\bar{n}} = 1 - I(\bar{n}^+; n) = 0.84 \tag{A8} $$

from Eq. (A6). So for the upper "1-$\sigma$" limit, $\bar{n}^+$, both the Bayesian result of setting the
integral of the Poisson distribution for the average in Eq. (A7) equal to 0.84 and the sum of
higher n agree.

For the lower 1-$\sigma$, the classical rule of setting the sum from 0 to n - 1 equal to 0.84 to
determine $\bar{n}^-$ (or the sum from n to infinity set to 0.16) gives

$$ \sum_{n'=0}^{n-1} P(n'; \bar{n}^-) = I(\bar{n}^-; n-1) = 0.84. \tag{A9} $$

This is not the same as setting the integral of the Bayesian Poisson distribution for the
average from 0 to $\bar{n}^-$ equal to 0.16

$$ 1 - I(\bar{n}^-; n) = \int_0^{\bar{n}^-} d\bar{n}\, \frac{\bar{n}^n}{n!}\, e^{-\bar{n}} = 0.16, \quad \text{or} \quad I(\bar{n}^-; n) = 0.84 \tag{A10} $$

from Eq. (A7). To see the difference, we note from Eq. (A3)

$$ I(\bar{n}^-; n) = P(n; \bar{n}^-) + I(\bar{n}^-; n-1). \tag{A11} $$

With the prior chosen to be $1/\bar{n}$, the lower limits agree but not the upper.
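The central identity of this appendix, that the integral of the Bayesian Poisson distribution from a cutoff to infinity equals a finite sum of Poisson probabilities, can be verified numerically. The following sketch uses a midpoint rule with a finite upper cutoff in place of infinity; the cutoff, step count, and function names are illustrative choices.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of k events for the given mean."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


def tail_integral(nbar_c, n, upper=80.0, steps=200000):
    """Integral of nbar^n e^{-nbar} / n! from nbar_c to `upper` by the midpoint rule."""
    h = (upper - nbar_c) / steps
    total = 0.0
    for k in range(steps):
        nbar = nbar_c + (k + 0.5) * h
        total += poisson_pmf(n, nbar)
    return total * h


def poisson_sum(nbar_c, n):
    """P(n; nbar_c) + P(n - 1; nbar_c) + ... + P(0; nbar_c)."""
    return sum(poisson_pmf(k, nbar_c) for k in range(n + 1))


# For n = 1 observed event and a cutoff of 3.300, both sides are close to 0.16,
# which is the one-sided tail used for a 1-sigma upper limit.
print(tail_integral(3.3, 1), poisson_sum(3.3, 1))
```

The agreement of the two numbers is the statement that the Bayesian upper limit with a uniform prior coincides with the classical upper limit.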

APPENDIX B: TABLE OF BAYESIAN POISSON CENTRAL LIMITS FOR THE 
AVERAGE AND TWO-SIDED CHI-SQUARED LIMITS 

The Bayesian Poisson average central interval limits with uniform prior are the upper or
lower $\bar{n}^\pm$ limits as in Eq. (A8) or Eq. (A10) beyond which the confidence level is below a
given value. This is in analogy with the $\bar{x} \pm \sigma$ one-$\sigma$ limits in a single Gaussian distribution,
where half of the excluded intervals on each side are used in the integral limits (0.16 on each
side for $1\sigma$). The following Table II covers lower and upper limits out to $3\sigma$, for n up
to n = 24.

Comparing Eq. A1 with the results of section 4 we have the relation between the Poisson
integral over the average and the equivalent chi-squared integral at a given confidence level,
say $CL^+$

$$ I(\bar{n}_c^+; n) = \int_{\bar{n}_c^+}^\infty d\bar{n}\, \frac{\bar{n}^n}{n!}\, e^{-\bar{n}} = \int_{(\chi_c^2)^+}^\infty d\chi^2\, f_N(\chi^2) = CL^+, \tag{B1} $$

with $(\chi_c^2)^+ = 2\bar{n}_c^+$ and N = 2(n + 1). For the lower confidence level limits

$$ 1 - I(\bar{n}_c^-; n) = \int_0^{\bar{n}_c^-} d\bar{n}\, \frac{\bar{n}^n}{n!}\, e^{-\bar{n}} = \int_0^{(\chi_c^2)^-} d\chi^2\, f_N(\chi^2) = CL^-. \tag{B2} $$

So in both cases, we can get the $\chi^2$ limits from Table II also by using

$$ (\chi_c^2)^\pm = 2\bar{n}_c^\pm. \tag{B3} $$

Table II was produced from the following Mathematica program (except for the n column),
which can be used to extend the table as needed. It also shows the actual confidence
levels used for the various column designations in the program.

    (* load the legacy statistics package that provides ChiSquareDistribution *)
    << Statistics`ContinuousDistributions`
    cl = {0.0013499, 0.01, 0.0227501, 0.1, 0.158655, 0.5, 0.841345, 0.9, 0.9772500, 0.99, 0.9996500}
    (* rows: k = 2(n + 1) for n = 1..24; entries: nbar limit = chi-squared quantile / 2 *)
    navgtable := N[Table[0.5 * Quantile[ChiSquareDistribution[k], cl[[i]]], {k, 4, 50, 2}, {i, 1, 11}], 4]
    TeXForm[navgtable // TableForm]
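For readers without Mathematica, the same limits can be sketched in Python using only the standard library. This is an illustrative equivalent, not the program used for the paper: it inverts the cumulative Bayesian Poisson distribution, written as one minus a finite Poisson sum, by bisection, which is the same as taking half of the chi-squared quantile for 2(n + 1) degrees of freedom. The function names and the bisection bracket are assumptions; the list of confidence levels is copied from the program above.

```python
import math

CONFIDENCE_LEVELS = [0.0013499, 0.01, 0.0227501, 0.1, 0.158655, 0.5,
                     0.841345, 0.9, 0.9772500, 0.99, 0.9996500]


def posterior_cdf(nbar, n):
    """Probability that the average lies below nbar, given n observed events."""
    tail = math.exp(-nbar) * sum(nbar ** k / math.factorial(k) for k in range(n + 1))
    return 1.0 - tail


def nbar_limit(n, cl):
    """Value of nbar below which a fraction cl of the posterior probability lies."""
    lo, hi = 0.0, 10.0 * (n + 1) + 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if posterior_cdf(mid, n) < cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def table_row(n):
    """One row of limits for n observed events, in the column order of the list above."""
    return [nbar_limit(n, cl) for cl in CONFIDENCE_LEVELS]


print([round(value, 4) for value in table_row(1)])
print(round(nbar_limit(0, 0.90), 2))  # one-sided 90% upper limit for zero events
```

The same `nbar_limit` function also gives the one-sided upper limits for zero observed events, for which the posterior is a pure exponential.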

For n = 0 events observed, the one-sided confidence interval upper bounds are meaningful
as opposed to two-sided intervals. The upper limits of intervals starting from zero which
contain 0.6827, 0.90, 0.95, 0.9545, 0.99, and 0.9973 probability are 1.15, 2.30, 3.00, 3.09, 4.61,
and 5.9, respectively. G. J. Feldman and R. D. Cousins use an approach which carefully
covers both single and double-sided cases.
Table II: Bayesian Poisson Central Limits for the Averages $\bar{n}^-$ and $\bar{n}^+$

 n    -3σ      0.01     -2σ      0.1      -1σ      0.5      1σ       0.9      2σ       0.99     3σ
 1    0.05288  0.1486   0.2301   0.5318   0.7082   1.678    3.300    3.890    5.683    6.638    10.39
 2    0.2117   0.4360   0.5963   1.102    1.367    2.674    4.638    5.322    7.348    8.406    12.47
 3    0.4653   0.8232   1.058    1.745    2.086    3.672    5.918    6.681    8.902    10.05    14.38
 4    0.7919   1.279    1.583    2.433    2.840    4.671    7.163    7.994    10.39    11.60    16.18
 5    1.175    1.785    2.153    3.152    3.620    5.670    8.382    9.275    11.82    13.11    17.90
 6    1.603    2.330    2.758    3.895    4.419    6.670    9.584    10.53    13.22    14.57    19.56
 7    2.068    2.906    3.391    4.656    5.232    7.669    10.77    11.77    14.59    16.00    21.17
 8    2.563    3.507    4.046    5.432    6.057    8.669    11.95    12.99    15.94    17.40    22.75
 9    3.084    4.130    4.719    6.221    6.891    9.669    13.11    14.21    17.27    18.78    24.30
10    3.628    4.771    5.409    7.021    7.734    10.67    14.27    15.41    18.58    20.14    25.82
11    4.191    5.428    6.113    7.829    8.585    11.67    15.42    16.60    19.87    21.49    27.32
12    4.772    6.099    6.828    8.646    9.441    12.67    16.56    17.78    21.16    22.82    28.80
13    5.367    6.782    7.555    9.470    10.30    13.67    17.70    18.96    22.43    24.14    30.26
14    5.977    7.477    8.291    10.30    11.17    14.67    18.83    20.13    23.70    25.45    31.70
15    6.599    8.181    9.036    11.14    12.04    15.67    19.96    21.29    24.95    26.74    33.13
16    7.233    8.895    9.789    11.98    12.92    16.67    21.08    22.45    26.20    28.03    34.55
17    7.877    9.616    10.55    12.82    13.80    17.67    22.20    23.61    27.44    29.31    35.95
18    8.530    10.35    11.32    13.67    14.68    18.67    23.32    24.76    28.68    30.58    37.34
19    9.193    11.08    12.09    14.53    15.57    19.67    24.44    25.90    29.90    31.85    38.72
20    9.863    11.83    12.87    15.38    16.45    20.67    25.55    27.05    31.13    33.10    40.10
21    10.54    12.57    13.65    16.24    17.35    21.67    26.66    28.18    32.34    34.35    41.46
22    11.23    13.33    14.44    17.11    18.24    22.67    27.76    29.32    33.55    35.60    42.82
23    11.92    14.09    15.23    17.97    19.14    23.67    28.87    30.45    34.76    36.84    44.17
24    12.62    14.85    16.03    18.84    20.03    24.67    29.97    31.58    35.96    38.08    45.51
APPENDIX C: TABLE FOR CHI-SQUARED VALUES AT VARIOUS
CONFIDENCE LEVELS

Since the joint method for n events requires $\chi^2$ for $N = 2(n+1) + N_G - N_{par}$ for a
uniform prior, or for $N = 2n + N_G - N_{par}$ for a logarithmic prior, both of which can be
large, we give here a table of chi-squared values for various confidence levels for large N up
to 25, and a program with which one can generate further limits.

In the following table, N is the number of degrees of freedom, and the designations of
1, 2, and 3$\sigma$ correspond to confidence levels of 0.682689, 0.954500, and 0.997300, respectively. The
Mathematica program used to generate the table is

    (* load the legacy statistics package that provides ChiSquareDistribution *)
    << Statistics`ContinuousDistributions`
    cl = {0.682689, 0.9, 0.954500, 0.99, 0.997300}
    (* rows: degrees of freedom k = 1..25; columns: the five confidence levels in cl *)
    cstable := N[Table[Quantile[ChiSquareDistribution[k], cl[[i]]], {k, 1, 25}, {i, 1, 5}], 4]
    TeXForm[cstable // TableForm]
Table III: Chi-squared Limits

 N    1σ      0.90    2σ      0.99    3σ
 1    1.000   2.706   4.000   6.635   9.000
 2    2.296   4.605   6.180   9.210   11.83
 3    3.527   6.251   8.025   11.34   14.16
 4    4.719   7.779   9.716   13.28   16.25
 5    5.888   9.236   11.31   15.09   18.21
 6    7.038   10.64   12.85   16.81   20.06
 7    8.176   12.02   14.34   18.48   21.85
 8    9.304   13.36   15.79   20.09   23.57
 9    10.42   14.68   17.21   21.67   25.26
10    11.54   15.99   18.61   23.21   26.90
11    12.64   17.28   19.99   24.72   28.51
12    13.74   18.55   21.35   26.22   30.10
13    14.84   19.81   22.69   27.69   31.66
14    15.94   21.06   24.03   29.14   33.20
15    17.03   22.31   25.34   30.58   34.71
16    18.11   23.54   26.65   32.00   36.22
17    19.20   24.77   27.95   33.41   37.70
18    20.28   25.99   29.24   34.81   39.17
19    21.36   27.20   30.52   36.19   40.63
20    22.44   28.41   31.80   37.57   42.08
21    23.51   29.62   33.07   38.93   43.52
22    24.59   30.81   34.33   40.29   44.94
23    25.66   32.01   35.58   41.64   46.36
24    26.73   33.20   36.83   42.98   47.76
25    27.80   34.38   38.07   44.31   49.16



APPENDIX D: SOLUTION FOR CHI-SQUARED EXPANSION ABOUT THE 
MINIMUM FOR THE LINEAR PARAMETER DEPENDENCE CASE 



For the case where the theoretical values for the mean in the Gaussian and Poisson
distributions are linear in parameters to be fitted, the minimum of $\chi^2$ and its quadratic
expansion about the minimum can be found analytically using the same method as for pure
Gaussian distributions [12,13]. While this may prove useful, in the usage here, however, the
maximal probability of the $\chi^2$ distribution is not at the minimum $\chi^2$, but at $\chi^2$ ~ n -

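In the method of expressing Poisson distributions for the average as $\chi^2$ distributions in
this paper, the final $\chi^2_{GP}$ is

$$\chi^2_G = \sum_{i=1}^{N_G} \frac{(y_i - F_i(a))^2}{\sigma_i^2}, \quad \text{and} \tag{D1}$$

$$\chi^2_{GP} = \chi^2_G + \sum_{\ell=1}^{N_P} 2\bar{n}_\ell(a), \tag{D2}$$

where $a$ is the set of $k$ parameters $a_m$. The experiments described by $(y_i, F_i)$ can even be
totally different, and the $F_i$ and $\bar{n}_\ell$ are assumed to be linearly expandable in the parameters

$$F_i(a) = \sum_{n=1}^{k} a_n f_{in}, \quad \text{and} \tag{D3}$$

$$\bar{n}_\ell(a) = \sum_{j=1}^{k} \bar{n}_{\ell j} a_j. \tag{D4}$$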

Minimizing $\chi^2_{GP}$ with respect to each $a_m$ gives rise to the vector $g$ and matrix $V^{-1}$ with
components

$$g_m = \sum_{i=1}^{N_G} \frac{y_i f_{im}}{\sigma_i^2} - \sum_{\ell=1}^{N_P} \bar{n}_{\ell m}, \quad \text{and} \tag{D5}$$

$$V^{-1}_{mn} = \sum_{i=1}^{N_G} \frac{f_{in} f_{im}}{\sigma_i^2}. \tag{D6}$$

Using the inverse matrix $V$, the values of the parameters that give the minimum $\chi^2_{GP}$ are
given by

$$\hat{a} = V g, \tag{D7}$$

with the effect of the $\bar{n}_{\ell k}$ terms entering through $g$. The minimum value of $\chi^2_{GP}$ is Eq. (D2)
evaluated at $a = \hat{a}$. $\chi^2_{GP}$ can then be rewritten in terms of $a$ away from the minimum values
as

$$\chi^2_{GP} = \chi^2_{GP\text{-}min} + (a - \hat{a})^T V^{-1} (a - \hat{a}). \tag{D8}$$
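The linear solution can be sketched numerically. The following is a minimal illustration in Python with the standard library only, assuming the forms of $\chi^2_{GP}$, $g$ and $V^{-1}$ given in this appendix. The function names (`solve`, `fit_linear_gp`, `chi2_gp`) and the argument layout are hypothetical choices made for this example.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_linear_gp(y, sigma, f, nbar_coef):
    """Return (a_hat, v_inv) for the linear Gaussian-plus-Poisson chi-squared.

    y[i], sigma[i] : Gaussian measurements and their standard deviations
    f[i][m]        : coefficients with F_i(a) = sum_m a[m] * f[i][m]
    nbar_coef[l][m]: coefficients with nbar_l(a) = sum_m nbar_coef[l][m] * a[m]
    """
    n_par = len(f[0])
    w = [1.0 / s ** 2 for s in sigma]
    # Vector g: Gaussian term minus the sum of the Poisson coefficients
    g = [sum(w[i] * y[i] * f[i][m] for i in range(len(y)))
         - sum(row[m] for row in nbar_coef)
         for m in range(n_par)]
    # Matrix V^{-1}: depends on the Gaussian experiments only
    v_inv = [[sum(w[i] * f[i][m] * f[i][n] for i in range(len(y)))
              for n in range(n_par)] for m in range(n_par)]
    a_hat = solve(v_inv, g)          # equivalent to a_hat = V g
    return a_hat, v_inv


def chi2_gp(a, y, sigma, f, nbar_coef):
    """Gaussian chi-squared plus twice the sum of the Poisson averages."""
    gauss = sum((y[i] - sum(f[i][m] * a[m] for m in range(len(a)))) ** 2
                / sigma[i] ** 2 for i in range(len(y)))
    poisson = sum(2.0 * sum(row[m] * a[m] for m in range(len(a)))
                  for row in nbar_coef)
    return gauss + poisson
```

Because $\chi^2_{GP}$ is exactly quadratic in the parameters, evaluating `chi2_gp` at any point should agree with the minimum value plus the quadratic form in $V^{-1}$, which is a convenient consistency check on an implementation.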

APPENDIX E: SOLUTION OF ONE BAYESIAN POISSON DISTRIBUTION 
WITH ONE GAUSSIAN DISTRIBUTION AND ONE LINEAR PARAMETER 

We present here the solution for the single linear parameter case with one Bayesian Poisson
and one Gaussian distribution. For the unknown parameter $a$, we have the theoretical
relations $\bar{n} = a c_P$ for the Poisson average, and $\bar{x} = a c$ with known standard deviation $\sigma$ for
the Gaussian average, where coefficients $c_P$ and $c$ are given, and $n$ and $x$ are the results of
the respective experiments. Then

$$\chi^2_{PG} = 2\bar{n} + (x - \bar{x})^2/\sigma^2. \tag{E1}$$

With one parameter to be fitted, the number of joint degrees of freedom with the equivalent
chi-squared method with a uniform prior is $N = 2n + 2 + 1 - 1 = 2n + 2$, where one degree of
freedom is cancelled by the one parameter. For the logarithmic prior, $N = 2n + 1 - 1 = 2n$,
which gives tighter $\chi^2$ limits.

The minimum of $\chi^2_{PG}$ occurs at

$$a c = x - \sigma^2 c_P / c \tag{E2}$$

giving the minimum chi-squared

$$\chi^2_{min} = 2x(c_P/c) - \sigma^2 (c_P/c)^2. \tag{E3}$$
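The two results above can be checked with a few lines of code. This is a small sketch in Python, assuming the expressions for the joint chi-squared, its minimum location, and its minimum value given in this appendix. The function names `chi2_pg` and `pg_minimum` and the sample numbers are illustrative only.

```python
def chi2_pg(a, x, sigma, c, c_p):
    """Joint chi-squared: 2*nbar + (x - xbar)^2 / sigma^2 with nbar = a*c_p, xbar = a*c."""
    return 2.0 * a * c_p + (x - a * c) ** 2 / sigma ** 2


def pg_minimum(x, sigma, c, c_p):
    """Return (a_hat, chi2_min) from the closed-form minimum."""
    a_hat = (x - sigma ** 2 * c_p / c) / c
    chi2_min = 2.0 * x * (c_p / c) - sigma ** 2 * (c_p / c) ** 2
    return a_hat, chi2_min


if __name__ == "__main__":
    # Illustrative inputs: Gaussian result x = 10 with sigma = 2, c = 2, c_P = 0.5
    a_hat, chi2_min = pg_minimum(10.0, 2.0, 2.0, 0.5)
    print(a_hat, chi2_min, chi2_pg(a_hat, 10.0, 2.0, 2.0, 0.5))
```

With these inputs the closed form gives $a = 4.5$ and $\chi^2_{min} = 4.75$, and evaluating the joint chi-squared directly at that $a$ returns the same value.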

When $\chi^2_{PG}$ is set equal to a certain upper limit boundary at $\chi^2_{lim}$, there are bounds on
the range of $a$ given by
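Because the joint chi-squared of this appendix is exactly quadratic in $a$, the bounds can be worked out from the location and value of the minimum. The lines below are a sketch of that algebra, assuming $c > 0$; here $\chi^2_{lim}$ denotes the chosen upper limit, and the symbols $\hat{a}$ (the minimizing value) and $a_\pm$ are introduced for illustration and need not match the notation of the original expression.

```latex
\begin{align}
\chi^2_{PG}(a) &= \chi^2_{min} + \frac{c^2 (a - \hat{a})^2}{\sigma^2},
  \qquad \hat{a}\, c = x - \sigma^2 \frac{c_P}{c}, \\
a_\pm\, c &= x - \sigma^2 \frac{c_P}{c} \pm \sigma \sqrt{\chi^2_{lim} - \chi^2_{min}} .
\end{align}
```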
For physical reasons we may want $a$ to be positive when $c_P$ and $c$ are positive. Looking at
$\bar{x} = ac$ above, we see that $\bar{x}$ and $a$ are positive when $x\bar{x}/\sigma^2 > \bar{n}$. In order to use a Gaussian,
we expect at least a 3-$\sigma$ separation of the peak from zero, or $x/\sigma > 3$ and $\bar{x}/\sigma > 3$. Thus
for $\bar{n} < 9$, this method works and $a > 0$. For $\bar{n} > 9$, $\bar{n}/\sqrt{\bar{n}} > 3$ and we can start using a
Gaussian instead of a Poisson for the $n$ experiment. The same reasoning follows through if
for example we require a 5-$\sigma$ separation from zero to use a Gaussian.

APPENDIX F: TWO POISSON DISTRIBUTIONS WITH ONE LINEAR 

PARAMETER 

We approach this problem both from Bayes' theorem directly, and from converting the
Bayesian Poisson distributions to chi-squared distributions as proposed in this paper. For
the latter we then merge the chi-squared distributions to a single chi-squared distribution for
the linear parameter and then convert that back to a joint Poisson distribution, to compare
to the direct approach. For the case of the logarithmic prior we find consistency.

The averages of the experiments are theoretically given by the parameter $a$ with respective
known coefficients $\bar{n}_1 = a c_1$ and $\bar{n}_2 = a c_2$. The direct Bayesian result is proportional to
the probability for observing the experimental values $n_1$ and $n_2$ given a value of $a$

$$\begin{aligned}
\mathrm{Prob}(a; n_1, n_2) &= P(n_1; a c_1) P(n_2; a c_2) P(a) / (P(n_1) P(n_2)) \\
&\propto (a c_1)^{n_1} (a c_2)^{n_2} e^{-a c_1} e^{-a c_2} P(a) \\
&\propto (a(c_1 + c_2))^{(n_1 + n_2)} e^{-a(c_1 + c_2)} P(a) \\
&\propto P(a(c_1 + c_2); n_1 + n_2) P(a).
\end{aligned} \tag{F1}$$

For the uniform prior, $P(a) = 1$, the normalized result is $P(a(c_1 + c_2); n_1 + n_2)$, integrating
over $d\,a(c_1 + c_2)$. For the logarithmic prior with $P(a) = 1/a$, the normalized result is the
same as the uniform prior with total $n$ lowered by 1, or $P(a(c_1 + c_2); n_1 + n_2 - 1)$, integrating
over $d\,a(c_1 + c_2)$.
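The two normalizations can be made explicit in a short worked step. The shorthand $\mu = a(c_1 + c_2)$ and $n = n_1 + n_2$ is introduced here for illustration, with $P(\mu; n) = \mu^n e^{-\mu}/n!$ taken as the Bayesian Poisson distribution for the average, and $n \ge 1$ assumed for the logarithmic prior.

```latex
\begin{align}
(a c_1)^{n_1} (a c_2)^{n_2}
  &= \frac{c_1^{n_1} c_2^{n_2}}{(c_1 + c_2)^{n}}\, \mu^{n} \;\propto\; \mu^{n}, \\
\text{uniform prior:} \quad
  \int_0^\infty \frac{\mu^{n} e^{-\mu}}{n!}\, d\mu &= 1
  \quad\Rightarrow\quad P(\mu; n)\, d\mu, \\
\text{logarithmic prior:} \quad
  \int_0^\infty \frac{\mu^{n-1} e^{-\mu}}{(n-1)!}\, d\mu &= 1
  \quad\Rightarrow\quad P(\mu; n - 1)\, d\mu .
\end{align}
```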
If we now start with the method in this paper, we take the joint Bayesian result
as the product of the Bayesian Poisson for each experiment as if they were independent,
$P(a c_1; n_1) P(a c_2; n_2)$ times either the uniform prior $d\bar{n}_1 d\bar{n}_2$ or the logarithmic prior
$d\bar{n}_1 d\bar{n}_2 / (\bar{n}_1 \bar{n}_2)$. The logarithmic prior is equivalent to $P(a c_1; n_1 - 1) P(a c_2; n_2 - 1)$ with a
uniform prior. Converting the uniform case to chi-squared distributions gives the convolution
of the product $f_{2n_1+2}(2 a c_1) f_{2n_2+2}(2 a c_2)$ leading to $f_{2n_1+2n_2+4}(2 a c_1 + 2 a c_2)$. Converting
this back to a Poisson distribution for the average gives $P(a c_1 + a c_2; n_1 + n_2 + 1)$ for the
uniform prior, which is inconsistent with the direct uniform Bayesian result in the previous
paragraph. For the logarithmic prior, converting to chi-squared distributions gives the convolution
of the product $f_{2n_1}(2 a c_1) f_{2n_2}(2 a c_2)$ which is $f_{2n_1+2n_2}(2 a c_1 + 2 a c_2)$. Converting this
back to a Poisson distribution for the average gives

$$P(a c_1 + a c_2; n_1 + n_2 - 1) \propto P(a c_1 + a c_2; n_1 + n_2)\, \frac{d\,a(c_1 + c_2)}{a(c_1 + c_2)}, \tag{F2}$$

which is consistent with the direct Bayesian result for the logarithmic prior in the previous
paragraph.
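The conversions back to a Poisson form rest on the identity $f_{2m+2}(2\mu) = \frac{1}{2} P(\mu; m)$ between the chi-squared density and the Bayesian Poisson distribution for the average. A minimal numerical check in Python follows, assuming the standard chi-squared density and $P(\mu; m) = \mu^m e^{-\mu}/m!$. The function names and the sample values of $a$, $c_1$, $c_2$, $n_1$, $n_2$ are illustrative only.

```python
import math


def chi2_pdf(x, dof):
    """Chi-squared probability density for dof degrees of freedom (x > 0)."""
    h = 0.5 * dof
    return math.exp((h - 1.0) * math.log(x) - 0.5 * x
                    - h * math.log(2.0) - math.lgamma(h))


def bayes_poisson(mu, m):
    """Bayesian Poisson density for the average mu given m events (uniform prior)."""
    return math.exp(m * math.log(mu) - mu - math.lgamma(m + 1))


if __name__ == "__main__":
    a, c1, c2, n1, n2 = 1.3, 0.7, 1.1, 2, 3      # illustrative values
    mu = a * (c1 + c2)
    # Uniform prior: 2*n1 + 2*n2 + 4 degrees of freedom <-> n1 + n2 + 1 events
    print(chi2_pdf(2 * mu, 2 * n1 + 2 * n2 + 4), 0.5 * bayes_poisson(mu, n1 + n2 + 1))
    # Logarithmic prior: 2*n1 + 2*n2 degrees of freedom <-> n1 + n2 - 1 events
    print(chi2_pdf(2 * mu, 2 * n1 + 2 * n2), 0.5 * bayes_poisson(mu, n1 + n2 - 1))
```

Each printed pair should agree to rounding error, which shows how the summed degrees of freedom map onto $n_1 + n_2 + 1$ events for the uniform prior and $n_1 + n_2 - 1$ events for the logarithmic prior.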

In the combined form as a single Bayesian Poisson distribution for the average, both 
upper and lower limits on a for a given central confidence interval can be found using the 
table in appendix B. The case where no events were observed in either experiment can also 
be dealt with using one-sided bounds, which are also given in appendix B. 






REFERENCES 



[1] An excellent discussion of classical and Bayesian methods, including the Bayesian Poisson
distribution for the average, which also contains many references, is R. D. Cousins,
Am. J. Phys. 63, 398 (1995).

[2] Introductory lectures on the web: "Probability and Measurement Uncertainty in Physics
- a Bayesian Primer", G. D'Agostini, hep-ph/9512295 (1995).



[3] G. J. Feldman and R. D. Cousins, Phys. Rev. D 57, 3873 (1998). 

[4] This is similar to the approach used by S. Baker and R. D. Cousins, Nucl. Inst. and
Meth. 221, 437 (1984), except that they use a likelihood function method which is the
ratio of the Poisson distribution to the Poisson distribution with the true mean.

[5] Bayes' Theorem also contains the ratio of the a priori probabilities of $P(\bar{n})/P(n)$. We
take both as unity, or $P(\bar{n}) = 1$, which is called a uniform prior.

[6] L. J. Rainwater and C. S. Wu, Nucleonics 1, 60 (1947). 

[7] Particle Data Group, Phys. Rev. D 54, Part I (1996), p. 164, beneath Fig. 28.4.

[8] Nucl. Inst. and Meth. 228, 120 (1984).

[9] D. Silverman, "The Full Range of Predictions for B Physics From Iso-singlet Down
Quark Mixing", hep-ph/9806489, Phys. Rev. D 58, 095006 (1998).



[10] BNL E787, S. Adler et al., Phys. Rev. Lett. 79, 2204 (1997).

[11] This formula is in M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions,
NBS, formula 26.4.2, for $Q(2\bar{n} \mid 2n + 2)$ with $c - 1 = n$ using our replacements,
and in Ref. 1, Eq. (18).

[12] For the pure Gaussian case, see for example Jon Mathews and R. L. Walker, Mathematical
Methods of Physics, Second Edition, Section 14-7, Addison-Wesley (1970).






[13] Here we use notation similar to the Particle Data Group, C. Caso et al., European
Physical Journal C3, 1 (1998), Sec. 29.5, and http://www-pdg.lbl.gov/1998/statrpp-partl.ps.






